Overview

Dataset statistics

Number of variables: 22
Number of observations: 3096
Missing cells: 11152
Missing cells (%): 16.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 532.3 KiB
Average record size in memory: 176.0 B
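The missing-cell percentage follows from the other figures in this block: the table has 22 variables by 3096 observations, and 11152 of those cells are empty. A minimal sketch of that arithmetic, with variable names chosen here for illustration:

```python
# Figures copied from the dataset statistics in this report.
n_variables = 22
n_observations = 3096
missing_cells = 11152

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

The result rounds to the 16.4% shown in the report.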

Variable types

Numeric: 11
Categorical: 11

Alerts

State Name has constant value "UTTARAKHAND" [Constant]
District Name is highly overall correlated with Water_Body_Nature and 2 other fields [High correlation]
Original_Storage_Capacity is highly overall correlated with Present_Storage_Capacity and 1 other field [High correlation]
Present_Storage_Capacity is highly overall correlated with Original_Storage_Capacity and 1 other field [High correlation]
Reason_for_Water_Body_Use is highly overall correlated with Water_Body_Nature and 2 other fields [High correlation]
Renovation_Year is highly overall correlated with construcion_year [High correlation]
Water_Body_Nature is highly overall correlated with District Name and 2 other fields [High correlation]
Water_body_in_use is highly overall correlated with District Name and 4 other fields [High correlation]
construcion_year is highly overall correlated with Renovation_Year [High correlation]
construction_cost is highly overall correlated with renovation_cost [High correlation]
df_index is highly overall correlated with level_0 [High correlation]
filled_up_storage_name is highly overall correlated with Water_body_in_use and 1 other field [High correlation]
filled_up_storage_space_name is highly overall correlated with Water_body_in_use and 1 other field [High correlation]
level_0 is highly overall correlated with df_index [High correlation]
no_people_benefited_by_water_body is highly overall correlated with Reason_for_Water_Body_Use and 1 other field [High correlation]
population_density_benefited is highly overall correlated with renovation_cost [High correlation]
renovation_cost is highly overall correlated with District Name and 2 other fields [High correlation]
storage_capacity_change is highly overall correlated with Original_Storage_Capacity and 2 other fields [High correlation]
Area_Type is highly imbalanced (75.5%) [Imbalance]
Water_Body_Type is highly imbalanced (65.0%) [Imbalance]
Repair_Renovation_Status is highly imbalanced (97.1%) [Imbalance]
construcion_year has 1654 (53.4%) missing values [Missing]
construction_cost has 1654 (53.4%) missing values [Missing]
Renovation_Year has 2789 (90.1%) missing values [Missing]
renovation_cost has 2789 (90.1%) missing values [Missing]
filled_up_storage_name has 46 (1.5%) missing values [Missing]
filled_up_storage_space_name has 46 (1.5%) missing values [Missing]
reason_water_body_in_use_name2 has 2174 (70.2%) missing values [Missing]
construction_cost is highly skewed (γ1 = 29.2937204) [Skewed]
Original_Storage_Capacity is highly skewed (γ1 = 55.58769145) [Skewed]
Present_Storage_Capacity is highly skewed (γ1 = 55.59527059) [Skewed]
no_people_benefited_by_water_body is highly skewed (γ1 = 22.83601348) [Skewed]
storage_capacity_change is highly skewed (γ1 = -55.52042306) [Skewed]
population_density_benefited is highly skewed (γ1 = 55.092668) [Skewed]
level_0 is uniformly distributed [Uniform]
df_index is uniformly distributed [Uniform]
level_0 has unique values [Unique]
df_index has unique values [Unique]
Original_Storage_Capacity has 46 (1.5%) zeros [Zeros]
Present_Storage_Capacity has 46 (1.5%) zeros [Zeros]
storage_capacity_change has 518 (16.7%) zeros [Zeros]
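The Constant and Unique alerts point at columns that carry no usable signal for most analyses: State Name holds a single value, while level_0 and df_index are row counters. The sketch below shows one way such columns could be detected before modelling. It is an illustration only: the function name `columns_to_drop` and the three toy rows are hypothetical, and a real table would be checked in full.

```python
def columns_to_drop(rows):
    """Return names of columns that are constant or unique per row.

    `rows` is a list of dicts, one per record (a hypothetical stand-in
    for the profiled table).
    """
    n_rows = len(rows)
    drop = []
    for name in rows[0]:
        distinct = {row[name] for row in rows}
        # One distinct value -> constant; n_rows distinct values -> unique.
        if len(distinct) == 1 or len(distinct) == n_rows:
            drop.append(name)
    return drop


# Hypothetical toy rows that reuse column names from the report.
rows = [
    {"level_0": 0, "State Name": "UTTARAKHAND", "Area_Type": "Rural"},
    {"level_0": 1, "State Name": "UTTARAKHAND", "Area_Type": "Rural"},
    {"level_0": 2, "State Name": "UTTARAKHAND", "Area_Type": "Urban"},
]
print(columns_to_drop(rows))
```

A genuine measurement column can also be unique in every row, so the result is a list of candidates to review rather than a list to drop blindly.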

Reproduction

Analysis started: 2023-12-09 13:04:39.927157
Analysis finished: 2023-12-09 13:04:52.810204
Duration: 12.88 seconds
Software version: ydata-profiling v4.6.2
Download configuration: config.json
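The reported duration is the difference between the two timestamps. A short check using only the Python standard library (the format string is an assumption based on how the timestamps are printed here):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the printed timestamps
started = datetime.strptime("2023-12-09 13:04:39.927157", FMT)
finished = datetime.strptime("2023-12-09 13:04:52.810204", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```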

Variables

level_0
Real number (ℝ)

HIGH CORRELATION  UNIFORM  UNIQUE 

Distinct: 3096
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1547.5
Minimum: 0
Maximum: 3095
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 24.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 154.75
Q1: 773.75
median: 1547.5
Q3: 2321.25
95-th percentile: 2940.25
Maximum: 3095
Range: 3095
Interquartile range (IQR): 1547.5

Descriptive statistics

Standard deviation: 893.88254
Coefficient of variation (CV): 0.57763008
Kurtosis: -1.2
Mean: 1547.5
Median Absolute Deviation (MAD): 774
Skewness: 0
Sum: 4791060
Variance: 799026
Monotonicity: Strictly increasing
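These figures are what a plain row counter produces. Since level_0 has 3096 distinct values, a minimum of 0, a maximum of 3095 and is strictly increasing, it must be the sequence 0, 1, ..., 3095. The sketch below recomputes the sum, mean, variance and standard deviation from that sequence, assuming the sample convention (division by n - 1) for the variance:

```python
import math

n = 3096                 # number of observations in the report
values = range(n)        # level_0 runs 0, 1, ..., 3095

total = sum(values)
mean = total / n
# Sample variance (divides by n - 1); this convention is an assumption.
variance = sum((x - mean) ** 2 for x in values) / (n - 1)
std = math.sqrt(variance)
cv = std / mean          # coefficient of variation

print(total, mean, variance, std, cv)
```

The sum, mean and variance match the reported 4791060, 1547.5 and 799026, which supports reading this column as an index rather than a measurement.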
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 1
 
< 0.1%
2032 1
 
< 0.1%
2058 1
 
< 0.1%
2059 1
 
< 0.1%
2060 1
 
< 0.1%
2061 1
 
< 0.1%
2062 1
 
< 0.1%
2063 1
 
< 0.1%
2064 1
 
< 0.1%
2065 1
 
< 0.1%
Other values (3086) 3086
99.7%
ValueCountFrequency (%)
0 1
< 0.1%
1 1
< 0.1%
2 1
< 0.1%
3 1
< 0.1%
4 1
< 0.1%
5 1
< 0.1%
6 1
< 0.1%
7 1
< 0.1%
8 1
< 0.1%
9 1
< 0.1%
ValueCountFrequency (%)
3095 1
< 0.1%
3094 1
< 0.1%
3093 1
< 0.1%
3092 1
< 0.1%
3091 1
< 0.1%
3090 1
< 0.1%
3089 1
< 0.1%
3088 1
< 0.1%
3087 1
< 0.1%
3086 1
< 0.1%

df_index
Real number (ℝ)

HIGH CORRELATION  UNIFORM  UNIQUE 

Distinct: 3096
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1547.5
Minimum: 0
Maximum: 3095
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 24.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 154.75
Q1: 773.75
median: 1547.5
Q3: 2321.25
95-th percentile: 2940.25
Maximum: 3095
Range: 3095
Interquartile range (IQR): 1547.5

Descriptive statistics

Standard deviation: 893.88254
Coefficient of variation (CV): 0.57763008
Kurtosis: -1.2
Mean: 1547.5
Median Absolute Deviation (MAD): 774
Skewness: 0
Sum: 4791060
Variance: 799026
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 1
 
< 0.1%
2032 1
 
< 0.1%
2058 1
 
< 0.1%
2059 1
 
< 0.1%
2060 1
 
< 0.1%
2061 1
 
< 0.1%
2062 1
 
< 0.1%
2063 1
 
< 0.1%
2064 1
 
< 0.1%
2065 1
 
< 0.1%
Other values (3086) 3086
99.7%
ValueCountFrequency (%)
0 1
< 0.1%
1 1
< 0.1%
2 1
< 0.1%
3 1
< 0.1%
4 1
< 0.1%
5 1
< 0.1%
6 1
< 0.1%
7 1
< 0.1%
8 1
< 0.1%
9 1
< 0.1%
ValueCountFrequency (%)
3095 1
< 0.1%
3094 1
< 0.1%
3093 1
< 0.1%
3092 1
< 0.1%
3091 1
< 0.1%
3090 1
< 0.1%
3089 1
< 0.1%
3088 1
< 0.1%
3087 1
< 0.1%
3086 1
< 0.1%

Area_Type
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB
Rural: 2970
Urban: 126

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters15480
Distinct characters8
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowRural
2nd rowRural
3rd rowRural
4th rowRural
5th rowRural

Common Values

Value    Count    Frequency (%)
Rural    2970     95.9%
Urban    126      4.1%
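The Imbalance alert for Area_Type quotes 75.5%, a figure that does not appear in this table. It is consistent with one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition; the function name `imbalance_score` is chosen here for illustration and is not taken from the profiler.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for class counts."""
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    # Maximum entropy is reached when all classes are equally frequent.
    return 1 - entropy / math.log2(len(counts))


# Counts copied from the report's Common Values tables.
print(f"Area_Type: {imbalance_score([2970, 126]):.3f}")
print(f"Repair_Renovation_Status: {imbalance_score([3087, 9]):.3f}")
```

A score of 0 means perfectly balanced classes and a score near 1 means almost every row falls in one class, which is why a 95.9% / 4.1% split is flagged.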

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
rural 2970
95.9%
urban 126
 
4.1%

Most occurring characters

ValueCountFrequency (%)
r 3096
20.0%
a 3096
20.0%
R 2970
19.2%
u 2970
19.2%
l 2970
19.2%
U 126
 
0.8%
b 126
 
0.8%
n 126
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12384
80.0%
Uppercase Letter 3096
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 3096
25.0%
a 3096
25.0%
u 2970
24.0%
l 2970
24.0%
b 126
 
1.0%
n 126
 
1.0%
Uppercase Letter
ValueCountFrequency (%)
R 2970
95.9%
U 126
 
4.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 15480
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 3096
20.0%
a 3096
20.0%
R 2970
19.2%
u 2970
19.2%
l 2970
19.2%
U 126
 
0.8%
b 126
 
0.8%
n 126
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15480
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 3096
20.0%
a 3096
20.0%
R 2970
19.2%
u 2970
19.2%
l 2970
19.2%
U 126
 
0.8%
b 126
 
0.8%
n 126
 
0.8%

State Name
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB
UTTARAKHAND: 3096

Length

Max length11
Median length11
Mean length11
Min length11

Characters and Unicode

Total characters34056
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowUTTARAKHAND
2nd rowUTTARAKHAND
3rd rowUTTARAKHAND
4th rowUTTARAKHAND
5th rowUTTARAKHAND

Common Values

Value          Count    Frequency (%)
UTTARAKHAND    3096     100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
uttarakhand 3096
100.0%

Most occurring characters

ValueCountFrequency (%)
A 9288
27.3%
T 6192
18.2%
U 3096
 
9.1%
R 3096
 
9.1%
K 3096
 
9.1%
H 3096
 
9.1%
N 3096
 
9.1%
D 3096
 
9.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 34056
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 9288
27.3%
T 6192
18.2%
U 3096
 
9.1%
R 3096
 
9.1%
K 3096
 
9.1%
H 3096
 
9.1%
N 3096
 
9.1%
D 3096
 
9.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 34056
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 9288
27.3%
T 6192
18.2%
U 3096
 
9.1%
R 3096
 
9.1%
K 3096
 
9.1%
H 3096
 
9.1%
N 3096
 
9.1%
D 3096
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 34056
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 9288
27.3%
T 6192
18.2%
U 3096
 
9.1%
R 3096
 
9.1%
K 3096
 
9.1%
H 3096
 
9.1%
N 3096
 
9.1%
D 3096
 
9.1%

District Name
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB
HARIDWAR: 1065
UDHAM SINGH NAGAR: 897
DEHRADUN: 210
ALMORA: 178
CHAMPAWAT: 143
Other values (8): 603

Length

Max length17
Median length11
Mean length10.51938
Min length5

Characters and Unicode

Total characters32568
Distinct characters21
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowUDHAM SINGH NAGAR
2nd rowUDHAM SINGH NAGAR
3rd rowUDHAM SINGH NAGAR
4th rowUDHAM SINGH NAGAR
5th rowHARIDWAR

Common Values

Value                Count    Frequency (%)
HARIDWAR             1065     34.4%
UDHAM SINGH NAGAR    897      29.0%
DEHRADUN             210      6.8%
ALMORA               178      5.7%
CHAMPAWAT            143      4.6%
BAGESHWAR            115      3.7%
PITHORGARH           93       3.0%
PAURI                80       2.6%
NANITAL              79       2.6%
TEHRI                75       2.4%
Other values (3)     161      5.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
haridwar 1065
21.8%
udham 897
18.3%
singh 897
18.3%
nagar 897
18.3%
dehradun 210
 
4.3%
almora 178
 
3.6%
champawat 143
 
2.9%
bageshwar 115
 
2.4%
pithorgarh 93
 
1.9%
pauri 80
 
1.6%
Other values (5) 315
 
6.4%

Most occurring characters

ValueCountFrequency (%)
A 6690
20.5%
R 4079
12.5%
H 3693
11.3%
D 2438
 
7.5%
I 2394
 
7.4%
N 2162
 
6.6%
G 2058
 
6.3%
1794
 
5.5%
W 1323
 
4.1%
U 1283
 
3.9%
Other values (11) 4654
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 30774
94.5%
Space Separator 1794
 
5.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 6690
21.7%
R 4079
13.3%
H 3693
12.0%
D 2438
 
7.9%
I 2394
 
7.8%
N 2162
 
7.0%
G 2058
 
6.7%
W 1323
 
4.3%
U 1283
 
4.2%
M 1283
 
4.2%
Other values (10) 3371
11.0%
Space Separator
ValueCountFrequency (%)
1794
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30774
94.5%
Common 1794
 
5.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 6690
21.7%
R 4079
13.3%
H 3693
12.0%
D 2438
 
7.9%
I 2394
 
7.8%
N 2162
 
7.0%
G 2058
 
6.7%
W 1323
 
4.3%
U 1283
 
4.2%
M 1283
 
4.2%
Other values (10) 3371
11.0%
Common
ValueCountFrequency (%)
1794
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 32568
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 6690
20.5%
R 4079
12.5%
H 3693
11.3%
D 2438
 
7.5%
I 2394
 
7.4%
N 2162
 
6.6%
G 2058
 
6.3%
1794
 
5.5%
W 1323
 
4.1%
U 1283
 
3.9%
Other values (11) 4654
14.3%

Water_Body_Type
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB
Ponds: 2514
Tank: 461
Lakes: 48
Water consv schemes/percolation tanks/check-dams: 41
Reservoirs: 27

Length

Max length48
Median length5
Mean length5.4657623
Min length4

Characters and Unicode

Total characters16922
Distinct characters25
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPonds
2nd rowPonds
3rd rowPonds
4th rowPonds
5th rowPonds

Common Values

Value                                               Count    Frequency (%)
Ponds                                               2514     81.2%
Tank                                                461      14.9%
Lakes                                               48       1.6%
Water consv schemes/percolation tanks/check-dams    41       1.3%
Reservoirs                                          27       0.9%
Others                                              5        0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
ponds 2514
78.1%
tank 461
 
14.3%
lakes 48
 
1.5%
water 41
 
1.3%
consv 41
 
1.3%
schemes/percolation 41
 
1.3%
tanks/check-dams 41
 
1.3%
reservoirs 27
 
0.8%
others 5
 
0.2%

Most occurring characters

ValueCountFrequency (%)
n 3098
18.3%
s 2826
16.7%
o 2664
15.7%
d 2555
15.1%
P 2514
14.9%
a 673
 
4.0%
k 591
 
3.5%
T 461
 
2.7%
e 312
 
1.8%
c 205
 
1.2%
Other values (15) 1023
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13580
80.3%
Uppercase Letter 3096
 
18.3%
Space Separator 123
 
0.7%
Other Punctuation 82
 
0.5%
Dash Punctuation 41
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 3098
22.8%
s 2826
20.8%
o 2664
19.6%
d 2555
18.8%
a 673
 
5.0%
k 591
 
4.4%
e 312
 
2.3%
c 205
 
1.5%
r 141
 
1.0%
t 128
 
0.9%
Other values (6) 387
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
P 2514
81.2%
T 461
 
14.9%
L 48
 
1.6%
W 41
 
1.3%
R 27
 
0.9%
O 5
 
0.2%
Space Separator
ValueCountFrequency (%)
123
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 82
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 41
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16676
98.5%
Common 246
 
1.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 3098
18.6%
s 2826
16.9%
o 2664
16.0%
d 2555
15.3%
P 2514
15.1%
a 673
 
4.0%
k 591
 
3.5%
T 461
 
2.8%
e 312
 
1.9%
c 205
 
1.2%
Other values (12) 777
 
4.7%
Common
ValueCountFrequency (%)
123
50.0%
/ 82
33.3%
- 41
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16922
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 3098
18.3%
s 2826
16.7%
o 2664
15.7%
d 2555
15.1%
P 2514
14.9%
a 673
 
4.0%
k 591
 
3.5%
T 461
 
2.7%
e 312
 
1.8%
c 205
 
1.2%
Other values (15) 1023
 
6.0%

Water_body_in_use
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB
1: 2371
0: 725

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters3096
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row1

Common Values

Value    Count    Frequency (%)
1        2371     76.6%
0        725      23.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 2371
76.6%
0 725
 
23.4%

Most occurring characters

ValueCountFrequency (%)
1 2371
76.6%
0 725
 
23.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3096
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2371
76.6%
0 725
 
23.4%

Most occurring scripts

ValueCountFrequency (%)
Common 3096
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2371
76.6%
0 725
 
23.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3096
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2371
76.6%
0 725
 
23.4%

Reason_for_Water_Body_Use
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB
Ground water recharge: 1267
other: 725
Pisciculture: 611
Irrigation: 322
Other: 101
Other values (4): 70

Length

Max length21
Median length17
Mean length13.593023
Min length5

Characters and Unicode

Total characters42084
Distinct characters25
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowother
2nd rowother
3rd rowother
4th rowother
5th rowGround water recharge

Common Values

Value                    Count    Frequency (%)
Ground water recharge    1267     40.9%
other                    725      23.4%
Pisciculture             611      19.7%
Irrigation               322      10.4%
Other                    101      3.3%
Recreation               26       0.8%
Religious                17       0.5%
Domestic/Drinking        16       0.5%
Industrial               11       0.4%
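The table lists "other" (725) and "Other" (101) as separate categories, so two of the 9 distinct values differ only in capitalization. A minimal sketch of merging labels that differ only by case, using the counts from the table; the case-folding rule is an assumption about how the labels should be reconciled:

```python
from collections import Counter

# Counts copied from the Common Values table.
reported = {
    "Ground water recharge": 1267,
    "other": 725,
    "Pisciculture": 611,
    "Irrigation": 322,
    "Other": 101,
    "Recreation": 26,
    "Religious": 17,
    "Domestic/Drinking": 16,
    "Industrial": 11,
}

merged = Counter()
for label, count in reported.items():
    # Fold case and trim whitespace so "Other" and "other" share a key.
    merged[label.strip().casefold()] += count

print(merged["other"], len(merged))
```

After merging there are 8 categories and the combined "other" count is 826, the same total the report's word-frequency table shows for "other".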

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
ground 1267
22.5%
water 1267
22.5%
recharge 1267
22.5%
other 826
14.7%
pisciculture 611
10.9%
irrigation 322
 
5.7%
recreation 26
 
0.5%
religious 17
 
0.3%
domestic/drinking 16
 
0.3%
industrial 11
 
0.2%

Most occurring characters

ValueCountFrequency (%)
r 7202
17.1%
e 5323
12.6%
t 3079
 
7.3%
a 2893
 
6.9%
2534
 
6.0%
c 2531
 
6.0%
u 2517
 
6.0%
o 2373
 
5.6%
h 2093
 
5.0%
i 1985
 
4.7%
Other values (15) 9554
22.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 37147
88.3%
Space Separator 2534
 
6.0%
Uppercase Letter 2387
 
5.7%
Other Punctuation 16
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 7202
19.4%
e 5323
14.3%
t 3079
8.3%
a 2893
7.8%
c 2531
 
6.8%
u 2517
 
6.8%
o 2373
 
6.4%
h 2093
 
5.6%
i 1985
 
5.3%
n 1658
 
4.5%
Other values (7) 5493
14.8%
Uppercase Letter
ValueCountFrequency (%)
G 1267
53.1%
P 611
25.6%
I 333
 
14.0%
O 101
 
4.2%
R 43
 
1.8%
D 32
 
1.3%
Space Separator
ValueCountFrequency (%)
2534
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 16
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 39534
93.9%
Common 2550
 
6.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 7202
18.2%
e 5323
13.5%
t 3079
 
7.8%
a 2893
 
7.3%
c 2531
 
6.4%
u 2517
 
6.4%
o 2373
 
6.0%
h 2093
 
5.3%
i 1985
 
5.0%
n 1658
 
4.2%
Other values (13) 7880
19.9%
Common
ValueCountFrequency (%)
2534
99.4%
/ 16
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42084
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 7202
17.1%
e 5323
12.6%
t 3079
 
7.3%
a 2893
 
6.9%
2534
 
6.0%
c 2531
 
6.0%
u 2517
 
6.0%
o 2373
 
5.6%
h 2093
 
5.0%
i 1985
 
4.7%
Other values (15) 9554
22.7%

Water_Body_Nature
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB
Natural: 1654
Man-made: 1442

Length

Max length8
Median length7
Mean length7.4657623
Min length7

Characters and Unicode

Total characters23114
Distinct characters12
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNatural
2nd rowMan-made
3rd rowMan-made
4th rowMan-made
5th rowNatural

Common Values

Value       Count    Frequency (%)
Natural     1654     53.4%
Man-made    1442     46.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
natural 1654
53.4%
man-made 1442
46.6%

Most occurring characters

ValueCountFrequency (%)
a 6192
26.8%
N 1654
 
7.2%
t 1654
 
7.2%
u 1654
 
7.2%
r 1654
 
7.2%
l 1654
 
7.2%
M 1442
 
6.2%
n 1442
 
6.2%
- 1442
 
6.2%
m 1442
 
6.2%
Other values (2) 2884
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18576
80.4%
Uppercase Letter 3096
 
13.4%
Dash Punctuation 1442
 
6.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6192
33.3%
t 1654
 
8.9%
u 1654
 
8.9%
r 1654
 
8.9%
l 1654
 
8.9%
n 1442
 
7.8%
m 1442
 
7.8%
d 1442
 
7.8%
e 1442
 
7.8%
Uppercase Letter
ValueCountFrequency (%)
N 1654
53.4%
M 1442
46.6%
Dash Punctuation
ValueCountFrequency (%)
- 1442
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 21672
93.8%
Common 1442
 
6.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6192
28.6%
N 1654
 
7.6%
t 1654
 
7.6%
u 1654
 
7.6%
r 1654
 
7.6%
l 1654
 
7.6%
M 1442
 
6.7%
n 1442
 
6.7%
m 1442
 
6.7%
d 1442
 
6.7%
Common
ValueCountFrequency (%)
- 1442
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23114
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6192
26.8%
N 1654
 
7.2%
t 1654
 
7.2%
u 1654
 
7.2%
r 1654
 
7.2%
l 1654
 
7.2%
M 1442
 
6.2%
n 1442
 
6.2%
- 1442
 
6.2%
m 1442
 
6.2%
Other values (2) 2884
12.5%

construcion_year
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 42
Distinct (%): 2.9%
Missing: 1654
Missing (%): 53.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.4612
Minimum: 1905
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.3 KiB

Quantile statistics

Minimum: 1905
5-th percentile: 2001
Q1: 2014
median: 2016
Q3: 2017
95-th percentile: 2019
Maximum: 2020
Range: 115
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 7.9539063
Coefficient of variation (CV): 0.0039503649
Kurtosis: 44.119486
Mean: 2013.4612
Median Absolute Deviation (MAD): 2
Skewness: -5.1630476
Sum: 2903411
Variance: 63.264625
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=42)
Common values

Value                Count    Frequency (%)
2017                 334      10.8%
2016                 213      6.9%
2018                 173      5.6%
2015                 157      5.1%
2014                 118      3.8%
2019                 72       2.3%
2002                 44       1.4%
2013                 40       1.3%
2010                 40       1.3%
2005                 35       1.1%
Other values (32)    216      7.0%
(Missing)            1654     53.4%

Minimum 10 values

Value    Count    Frequency (%)
1905     1        < 0.1%
1936     1        < 0.1%
1947     1        < 0.1%
1950     2        0.1%
1960     1        < 0.1%
1962     1        < 0.1%
1965     1        < 0.1%
1967     1        < 0.1%
1970     1        < 0.1%
1972     1        < 0.1%

Maximum 10 values

Value    Count    Frequency (%)
2020     20       0.6%
2019     72       2.3%
2018     173      5.6%
2017     334      10.8%
2016     213      6.9%
2015     157      5.1%
2014     118      3.8%
2013     40       1.3%
2012     34       1.1%
2011     17       0.5%

construction_cost
Real number (ℝ)

HIGH CORRELATION  MISSING  SKEWED 

Distinct: 184
Distinct (%): 12.8%
Missing: 1654
Missing (%): 53.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 12413710
Minimum: 0
Maximum: 8.39 × 10^9
Zeros: 3
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 24.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 24050
Q1: 61250
median: 90000
Q3: 200000
95-th percentile: 437850
Maximum: 8.39 × 10^9
Range: 8.39 × 10^9
Interquartile range (IQR): 138750

Descriptive statistics

Standard deviation: 2.4528547 × 10^8
Coefficient of variation (CV): 19.759239
Kurtosis: 959.95713
Mean: 12413710
Median Absolute Deviation (MAD): 40000
Skewness: 29.29372
Sum: 1.790057 × 10^10
Variance: 6.016496 × 10^16
Monotonicity: Not monotonic
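The gap between the mean (12413710) and the median (90000) shows how strongly a few very large values dominate this column. One common way to put a number on "very large" is Tukey's 1.5 × IQR rule. The report itself does not use that rule; it is introduced here only as an illustration. The sketch recomputes the IQR and the coefficient of variation from the reported figures and derives the fences:

```python
# Figures copied from the quantile and descriptive statistics above.
q1 = 61250
q3 = 200000
mean = 12413710
std = 2.4528547e8

iqr = q3 - q1                     # spread of the middle 50% of values
lower_fence = q1 - 1.5 * iqr      # Tukey's rule of thumb (an assumption here)
upper_fence = q3 + 1.5 * iqr
cv = std / mean                   # coefficient of variation

print(iqr, lower_fence, upper_fence, round(cv, 3))
```

The IQR reproduces the reported 138750. The upper fence falls below the 95-th percentile of 437850, so under this rule at least roughly 5% of the recorded costs would count as high outliers.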
Histogram with fixed size bins (bins=50)
Common values

Value                 Count    Frequency (%)
100000                161      5.2%
80000                 145      4.7%
90000                 112      3.6%
200000                104      3.4%
50000                 92       3.0%
150000                75       2.4%
40000                 65       2.1%
250000                64       2.1%
300000                50       1.6%
60000                 48       1.6%
Other values (174)    526      17.0%
(Missing)             1654     53.4%

Minimum 10 values

Value    Count    Frequency (%)
0        3        0.1%
1        19       0.6%
2        1        < 0.1%
50       3        0.1%
1850     1        < 0.1%
6000     1        < 0.1%
10000    5        0.2%
11000    1        < 0.1%
12000    4        0.1%
14000    1        < 0.1%

Maximum 10 values

Value         Count    Frequency (%)
8390000000    1        < 0.1%
2466900000    1        < 0.1%
2400000000    1        < 0.1%
1500000000    1        < 0.1%
1200000000    1        < 0.1%
900000000     1        < 0.1%
380400000     1        < 0.1%
168316000     1        < 0.1%
73800000      1        < 0.1%
36000000      1        < 0.1%

Renovation_Year
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 19
Distinct (%): 6.2%
Missing: 2789
Missing (%): 90.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.7101
Minimum: 1995
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.3 KiB

Quantile statistics

Minimum: 1995
5-th percentile: 2008
Q1: 2012
median: 2015
Q3: 2017
95-th percentile: 2018
Maximum: 2020
Range: 25
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.7838661
Coefficient of variation (CV): 0.0018790521
Kurtosis: 2.2060216
Mean: 2013.7101
Median Absolute Deviation (MAD): 3
Skewness: -1.0220579
Sum: 618209
Variance: 14.317643
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
Common values

Value               Count    Frequency (%)
2012                67       2.2%
2017                61       2.0%
2016                39       1.3%
2010                31       1.0%
2015                23       0.7%
2008                21       0.7%
2018                18       0.6%
2019                10       0.3%
2013                9        0.3%
2011                6        0.2%
Other values (9)    22       0.7%
(Missing)           2789     90.1%

Minimum 10 values

Value    Count    Frequency (%)
1995     1        < 0.1%
1998     1        < 0.1%
2000     1        < 0.1%
2002     1        < 0.1%
2005     3        0.1%
2006     2        0.1%
2008     21       0.7%
2009     2        0.1%
2010     31       1.0%
2011     6        0.2%

Maximum 10 values

Value    Count    Frequency (%)
2020     5        0.2%
2019     10       0.3%
2018     18       0.6%
2017     61       2.0%
2016     39       1.3%
2015     23       0.7%
2014     6        0.2%
2013     9        0.3%
2012     67       2.2%
2011     6        0.2%

renovation_cost
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 76
Distinct (%): 24.8%
Missing: 2789
Missing (%): 90.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 352597.3
Minimum: 1000
Maximum: 46300000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.3 KiB

Quantile statistics

Minimum: 1000
5-th percentile: 5000
Q1: 8000
median: 20000
Q3: 80000
95-th percentile: 447800
Maximum: 46300000
Range: 46299000
Interquartile range (IQR): 72000

Descriptive statistics

Standard deviation: 2973615.1
Coefficient of variation (CV): 8.4334597
Kurtosis: 194.76923
Mean: 352597.3
Median Absolute Deviation (MAD): 14000
Skewness: 13.365617
Sum: 1.0824737 × 10^8
Variance: 8.8423868 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count    Frequency (%)
10000                45       1.5%
20000                33       1.1%
5000                 33       1.1%
8000                 23       0.7%
100000               19       0.6%
6000                 12       0.4%
25000                11       0.4%
50000                9        0.3%
200000               8        0.3%
3000                 8        0.3%
Other values (66)    106      3.4%
(Missing)            2789     90.1%

Minimum 10 values

Value    Count    Frequency (%)
1000     2        0.1%
2000     2        0.1%
3000     8        0.3%
4000     1        < 0.1%
5000     33       1.1%
5600     1        < 0.1%
6000     12       0.4%
7000     1        < 0.1%
8000     23       0.7%
8010     1        < 0.1%

Maximum 10 values

Value       Count    Frequency (%)
46300000    1        < 0.1%
20500000    1        < 0.1%
12000000    1        < 0.1%
4650304     1        < 0.1%
2000000     1        < 0.1%
1442000     1        < 0.1%
1390000     1        < 0.1%
900000      1        < 0.1%
852000      1        < 0.1%
780000      1        < 0.1%

Repair_Renovation_Status
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB
0: 3087
1: 9

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters3096
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value    Count    Frequency (%)
0        3087     99.7%
1        9        0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 3087
99.7%
1 9
 
0.3%

Most occurring characters

ValueCountFrequency (%)
0 3087
99.7%
1 9
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3096
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3087
99.7%
1 9
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Common 3096
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3087
99.7%
1 9
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3096
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3087
99.7%
1 9
 
0.3%

Original_Storage_Capacity
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 1346
Distinct (%): 43.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1817741.5
Minimum: 0
Maximum: 5.3367534 × 10^9
Zeros: 46
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 24.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 12
Q1: 150
median: 1849
Q3: 6000
95-th percentile: 27000
Maximum: 5.3367534 × 10^9
Range: 5.3367534 × 10^9
Interquartile range (IQR): 5850

Descriptive statistics

Standard deviation: 95942479
Coefficient of variation (CV): 52.781146
Kurtosis: 3091.9521
Mean: 1817741.5
Median Absolute Deviation (MAD): 1793
Skewness: 55.587691
Sum: 5.6277276 × 10^9
Variance: 9.2049593 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count    Frequency (%)
12                     109      3.5%
50                     55       1.8%
40                     48       1.6%
0                      46       1.5%
300                    44       1.4%
200                    42       1.4%
100                    39       1.3%
2400                   38       1.2%
30                     34       1.1%
60                     32       1.0%
Other values (1336)    2609     84.3%

Minimum 10 values

Value    Count    Frequency (%)
0        46       1.5%
1        3        0.1%
2        2        0.1%
4        1        < 0.1%
9        1        < 0.1%
10       1        < 0.1%
11       1        < 0.1%
12       109      3.5%
15       3        0.1%
16       3        0.1%

Maximum 10 values

Value         Count    Frequency (%)
5336753400    1        < 0.1%
97560000      1        < 0.1%
67680000      1        < 0.1%
59820000      1        < 0.1%
30000002      1        < 0.1%
7504000       1        < 0.1%
1600000       1        < 0.1%
1352000       1        < 0.1%
985159        1        < 0.1%
708300        1        < 0.1%

Present_Storage_Capacity
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 1176
Distinct (%): 38.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 899714.91
Minimum: 0
Maximum: 2.6683767 × 10^9
Zeros: 46
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 24.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 10
Q1: 100
median: 1031.5
Q3: 3550.75
95-th percentile: 18000
Maximum: 2.6683767 × 10^9
Range: 2.6683767 × 10^9
Interquartile range (IQR): 3450.75

Descriptive statistics

Standard deviation: 47969255
Coefficient of variation (CV): 53.316061
Kurtosis: 3092.507
Mean: 899714.91
Median Absolute Deviation (MAD): 991.5
Skewness: 55.595271
Sum: 2.7855174 × 10^9
Variance: 2.3010494 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count    Frequency (%)
10                     142      4.6%
50                     64       2.1%
100                    63       2.0%
0                      46       1.5%
20                     45       1.5%
1200                   45       1.5%
200                    43       1.4%
40                     42       1.4%
300                    37       1.2%
1                      35       1.1%
Other values (1166)    2534     81.8%

Minimum 10 values

Value    Count    Frequency (%)
0        46       1.5%
1        35       1.1%
2        3        0.1%
3        1        < 0.1%
5        9        0.3%
6        1        < 0.1%
8        3        0.1%
10       142      4.6%
12       1        < 0.1%
14       1        < 0.1%

Maximum 10 values

Value         Count    Frequency (%)
2668376700    1        < 0.1%
59820000      1        < 0.1%
16260000      1        < 0.1%
11280000      1        < 0.1%
6015600       1        < 0.1%
2500000       1        < 0.1%
1500000       1        < 0.1%
1352000       1        < 0.1%
895599        1        < 0.1%
700000        1        < 0.1%

filled_up_storage_name
Categorical

HIGH CORRELATION  MISSING 

Distinct: 5
Distinct (%): 0.2%
Missing: 46
Missing (%): 1.5%
Memory size: 24.3 KiB

Value | Count
Full | 1396
Upto 1/2 | 607
Upto 3/4 | 520
Nil/Negligible filled up | 302
Upto 1/4 | 225

Length

Max length: 24
Median length: 8
Mean length: 7.7534426
Min length: 4

Characters and Unicode

Total characters: 23648
Distinct characters: 20
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
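
As a rough illustration of how such per-category counts can be obtained, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the four-value sample list are illustrative assumptions, not taken from the tool that produced this report. The two-letter codes map onto the category names used here: `Lu` is Uppercase Letter, `Ll` is Lowercase Letter, `Nd` is Decimal Number, `Zs` is Space Separator and `Po` is Other Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character of every string."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# One occurrence of four of the category labels seen in filled_up_storage_name.
sample = ["Nil/Negligible filled up", "Upto 1/4", "Upto 1/2", "Full"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Even on this tiny sample, five distinct categories appear, which matches the "Distinct categories" figure reported for this variable.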

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Nil/Negligible filled up
2nd row: Nil/Negligible filled up
3rd row: Upto 1/4
4th row: Upto 1/2
5th row: Nil/Negligible filled up

Common Values

Value | Count | Frequency (%)
Full | 1396 | 45.1%
Upto 1/2 | 607 | 19.6%
Upto 3/4 | 520 | 16.8%
Nil/Negligible filled up | 302 | 9.8%
Upto 1/4 | 225 | 7.3%
(Missing) | 46 | 1.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
full | 1396 | 27.9%
upto | 1352 | 27.0%
1/2 | 607 | 12.1%
3/4 | 520 | 10.4%
nil/negligible | 302 | 6.0%
filled | 302 | 6.0%
up | 302 | 6.0%
1/4 | 225 | 4.5%

Most occurring characters

Value | Count | Frequency (%)
l | 4302 | 18.2%
(space) | 1956 | 8.3%
u | 1698 | 7.2%
p | 1654 | 7.0%
/ | 1654 | 7.0%
F | 1396 | 5.9%
U | 1352 | 5.7%
t | 1352 | 5.7%
o | 1352 | 5.7%
i | 1208 | 5.1%
Other values (10) | 5724 | 24.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 13982 | 59.1%
Uppercase Letter | 3352 | 14.2%
Decimal Number | 2704 | 11.4%
Space Separator | 1956 | 8.3%
Other Punctuation | 1654 | 7.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
l | 4302 | 30.8%
u | 1698 | 12.1%
p | 1654 | 11.8%
t | 1352 | 9.7%
o | 1352 | 9.7%
i | 1208 | 8.6%
e | 906 | 6.5%
g | 604 | 4.3%
b | 302 | 2.2%
f | 302 | 2.2%

Decimal Number
Value | Count | Frequency (%)
1 | 832 | 30.8%
4 | 745 | 27.6%
2 | 607 | 22.4%
3 | 520 | 19.2%

Uppercase Letter
Value | Count | Frequency (%)
F | 1396 | 41.6%
U | 1352 | 40.3%
N | 604 | 18.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1956 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 1654 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 17334 | 73.3%
Common | 6314 | 26.7%

Most frequent character per script

Latin
Value | Count | Frequency (%)
l | 4302 | 24.8%
u | 1698 | 9.8%
p | 1654 | 9.5%
F | 1396 | 8.1%
U | 1352 | 7.8%
t | 1352 | 7.8%
o | 1352 | 7.8%
i | 1208 | 7.0%
e | 906 | 5.2%
N | 604 | 3.5%
Other values (4) | 1510 | 8.7%

Common
Value | Count | Frequency (%)
(space) | 1956 | 31.0%
/ | 1654 | 26.2%
1 | 832 | 13.2%
4 | 745 | 11.8%
2 | 607 | 9.6%
3 | 520 | 8.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 23648 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
l | 4302 | 18.2%
(space) | 1956 | 8.3%
u | 1698 | 7.2%
p | 1654 | 7.0%
/ | 1654 | 7.0%
F | 1396 | 5.9%
U | 1352 | 5.7%
t | 1352 | 5.7%
o | 1352 | 5.7%
i | 1208 | 5.1%
Other values (10) | 5724 | 24.2%

filled_up_storage_space_name
Categorical

HIGH CORRELATION  MISSING 

Distinct: 4
Distinct (%): 0.1%
Missing: 46
Missing (%): 1.5%
Memory size: 24.3 KiB

Value | Count
Filled up every year | 1336
Usually filled up | 1002
Rarely filled up | 577
Never filled up | 135

Length

Max length: 20
Median length: 17
Mean length: 18.036393
Min length: 15

Characters and Unicode

Total characters: 55011
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Rarely filled up
2nd row: Rarely filled up
3rd row: Rarely filled up
4th row: Rarely filled up
5th row: Never filled up

Common Values

Value | Count | Frequency (%)
Filled up every year | 1336 | 43.2%
Usually filled up | 1002 | 32.4%
Rarely filled up | 577 | 18.6%
Never filled up | 135 | 4.4%
(Missing) | 46 | 1.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
filled | 3050 | 29.1%
up | 3050 | 29.1%
every | 1336 | 12.7%
year | 1336 | 12.7%
usually | 1002 | 9.6%
rarely | 577 | 5.5%
never | 135 | 1.3%

Most occurring characters

Value | Count | Frequency (%)
l | 8681 | 15.8%
e | 7905 | 14.4%
(space) | 7436 | 13.5%
y | 4251 | 7.7%
u | 4052 | 7.4%
r | 3384 | 6.2%
d | 3050 | 5.5%
p | 3050 | 5.5%
i | 3050 | 5.5%
a | 2915 | 5.3%
Other values (7) | 7237 | 13.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 44525 | 80.9%
Space Separator | 7436 | 13.5%
Uppercase Letter | 3050 | 5.5%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
l | 8681 | 19.5%
e | 7905 | 17.8%
y | 4251 | 9.5%
u | 4052 | 9.1%
r | 3384 | 7.6%
d | 3050 | 6.9%
p | 3050 | 6.9%
i | 3050 | 6.9%
a | 2915 | 6.5%
f | 1714 | 3.8%
Other values (2) | 2473 | 5.6%

Uppercase Letter
Value | Count | Frequency (%)
F | 1336 | 43.8%
U | 1002 | 32.9%
R | 577 | 18.9%
N | 135 | 4.4%

Space Separator
Value | Count | Frequency (%)
(space) | 7436 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 47575 | 86.5%
Common | 7436 | 13.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
l | 8681 | 18.2%
e | 7905 | 16.6%
y | 4251 | 8.9%
u | 4052 | 8.5%
r | 3384 | 7.1%
d | 3050 | 6.4%
p | 3050 | 6.4%
i | 3050 | 6.4%
a | 2915 | 6.1%
f | 1714 | 3.6%
Other values (6) | 5523 | 11.6%

Common
Value | Count | Frequency (%)
(space) | 7436 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 55011 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
l | 8681 | 15.8%
e | 7905 | 14.4%
(space) | 7436 | 13.5%
y | 4251 | 7.7%
u | 4052 | 7.4%
r | 3384 | 6.2%
d | 3050 | 5.5%
p | 3050 | 5.5%
i | 3050 | 5.5%
a | 2915 | 5.3%
Other values (7) | 7237 | 13.2%

no_people_benefited_by_water_body
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 95
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 191.28941
Minimum: 1
Maximum: 97586
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 7
Q3: 15
95-th percentile: 60
Maximum: 97586
Range: 97585
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 3440.2533
Coefficient of variation (CV): 17.984547
Kurtosis: 550.80154
Mean: 191.28941
Median Absolute Deviation (MAD): 5
Skewness: 22.836013
Sum: 592232
Variance: 11835343
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
1 | 435 | 14.1%
10 | 324 | 10.5%
4 | 303 | 9.8%
2 | 270 | 8.7%
3 | 267 | 8.6%
25 | 203 | 6.6%
5 | 169 | 5.5%
20 | 128 | 4.1%
15 | 123 | 4.0%
8 | 114 | 3.7%
Other values (85) | 760 | 24.5%

Value | Count | Frequency (%)
1 | 435 | 14.1%
2 | 270 | 8.7%
3 | 267 | 8.6%
4 | 303 | 9.8%
5 | 169 | 5.5%
6 | 92 | 3.0%
7 | 63 | 2.0%
8 | 114 | 3.7%
9 | 15 | 0.5%
10 | 324 | 10.5%

Value | Count | Frequency (%)
97586 | 1 | < 0.1%
90000 | 1 | < 0.1%
80000 | 1 | < 0.1%
72350 | 1 | < 0.1%
54895 | 1 | < 0.1%
50000 | 1 | < 0.1%
35513 | 1 | < 0.1%
22000 | 1 | < 0.1%
12000 | 1 | < 0.1%
5000 | 2 | 0.1%

reason_water_body_in_use_name2
Categorical

MISSING 

Distinct: 8
Distinct (%): 0.9%
Missing: 2174
Missing (%): 70.2%
Memory size: 24.3 KiB

Value | Count
Ground water recharge | 529
Other | 129
Pisciculture | 90
Recreation | 64
Domestic/Drinking | 50
Other values (3) | 60

Length

Max length: 21
Median length: 21
Mean length: 16.168113
Min length: 5

Characters and Unicode

Total characters: 14907
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Other
2nd row: Ground water recharge
3rd row: Ground water recharge
4th row: Other
5th row: Irrigation

Common Values

Value | Count | Frequency (%)
Ground water recharge | 529 | 17.1%
Other | 129 | 4.2%
Pisciculture | 90 | 2.9%
Recreation | 64 | 2.1%
Domestic/Drinking | 50 | 1.6%
Irrigation | 35 | 1.1%
Religious | 17 | 0.5%
Industrial | 8 | 0.3%
(Missing) | 2174 | 70.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
ground | 529 | 26.7%
water | 529 | 26.7%
recharge | 529 | 26.7%
other | 129 | 6.5%
pisciculture | 90 | 4.5%
recreation | 64 | 3.2%
domestic/drinking | 50 | 2.5%
irrigation | 35 | 1.8%
religious | 17 | 0.9%
industrial | 8 | 0.4%

Most occurring characters

Value | Count | Frequency (%)
r | 2527 | 17.0%
e | 2001 | 13.4%
a | 1165 | 7.8%
(space) | 1058 | 7.1%
t | 905 | 6.1%
c | 823 | 5.5%
n | 736 | 4.9%
u | 734 | 4.9%
o | 695 | 4.7%
h | 658 | 4.4%
Other values (15) | 3605 | 24.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 12827 | 86.0%
Space Separator | 1058 | 7.1%
Uppercase Letter | 972 | 6.5%
Other Punctuation | 50 | 0.3%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 2527 | 19.7%
e | 2001 | 15.6%
a | 1165 | 9.1%
t | 905 | 7.1%
c | 823 | 6.4%
n | 736 | 5.7%
u | 734 | 5.7%
o | 695 | 5.4%
h | 658 | 5.1%
g | 631 | 4.9%
Other values (7) | 1952 | 15.2%

Uppercase Letter
Value | Count | Frequency (%)
G | 529 | 54.4%
O | 129 | 13.3%
D | 100 | 10.3%
P | 90 | 9.3%
R | 81 | 8.3%
I | 43 | 4.4%

Space Separator
Value | Count | Frequency (%)
(space) | 1058 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 50 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 13799 | 92.6%
Common | 1108 | 7.4%

Most frequent character per script

Latin
Value | Count | Frequency (%)
r | 2527 | 18.3%
e | 2001 | 14.5%
a | 1165 | 8.4%
t | 905 | 6.6%
c | 823 | 6.0%
n | 736 | 5.3%
u | 734 | 5.3%
o | 695 | 5.0%
h | 658 | 4.8%
g | 631 | 4.6%
Other values (13) | 2924 | 21.2%

Common
Value | Count | Frequency (%)
(space) | 1058 | 95.5%
/ | 50 | 4.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 14907 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
r | 2527 | 17.0%
e | 2001 | 13.4%
a | 1165 | 7.8%
(space) | 1058 | 7.1%
t | 905 | 6.1%
c | 823 | 5.5%
n | 736 | 4.9%
u | 734 | 4.9%
o | 695 | 4.7%
h | 658 | 4.4%
Other values (15) | 3605 | 24.2%

storage_capacity_change
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 1120
Distinct (%): 36.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -918026.57
Minimum: -2.6683767 × 10^9
Maximum: 11840
Zeros: 518
Zeros (%): 16.7%
Negative: 2567
Negative (%): 82.9%
Memory size: 24.3 KiB

Quantile statistics

Minimum: -2.6683767 × 10^9
5-th percentile: -9000
Q1: -1800.5
median: -440
Q3: -9
95-th percentile: 0
Maximum: 11840
Range: 2.6683885 × 10^9
Interquartile range (IQR): 1791.5

Descriptive statistics

Standard deviation: 47990893
Coefficient of variation (CV): -52.276148
Kurtosis: 3086.8457
Mean: -918026.57
Median Absolute Deviation (MAD): 440
Skewness: -55.520423
Sum: -2.8422103 × 10^9
Variance: 2.3031258 × 10^15
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 518 | 16.7%
-2 | 121 | 3.9%
-10 | 58 | 1.9%
-20 | 52 | 1.7%
-5 | 45 | 1.5%
-1000 | 34 | 1.1%
-30 | 26 | 0.8%
-500 | 25 | 0.8%
-400 | 25 | 0.8%
-2000 | 24 | 0.8%
Other values (1110) | 2168 | 70.0%

Value | Count | Frequency (%)
-2668376700 | 1 | < 0.1%
-81300000 | 1 | < 0.1%
-56400000 | 1 | < 0.1%
-27500002 | 1 | < 0.1%
-1488400 | 1 | < 0.1%
-179800 | 1 | < 0.1%
-122400 | 1 | < 0.1%
-119900 | 1 | < 0.1%
-118608 | 1 | < 0.1%
-109200 | 1 | < 0.1%

Value | Count | Frequency (%)
11840 | 1 | < 0.1%
7950 | 1 | < 0.1%
3600 | 1 | < 0.1%
2500 | 1 | < 0.1%
1050 | 1 | < 0.1%
600 | 1 | < 0.1%
562 | 1 | < 0.1%
420 | 1 | < 0.1%
48 | 1 | < 0.1%
12 | 1 | < 0.1%

population_density_benefited
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 215
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1251011
Minimum: 0
Maximum: 222.36472
Zeros: 3
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 24.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.1
Q1: 1.1251011
median: 1.1251011
Q3: 1.1251011
95-th percentile: 1.1251011
Maximum: 222.36472
Range: 222.36472
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 3.9905949
Coefficient of variation (CV): 3.5468768
Kurtosis: 3055.3143
Mean: 1.1251011
Median Absolute Deviation (MAD): 0
Skewness: 55.092668
Sum: 3483.3129
Variance: 15.924848
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
1.125101077 | 2789 | 90.1%
0.12 | 8 | 0.3%
0.001 | 7 | 0.2%
1 | 7 | 0.2%
0.4 | 7 | 0.2%
0.15 | 7 | 0.2%
0.06 | 5 | 0.2%
0.1 | 5 | 0.2%
0.5 | 5 | 0.2%
2 | 4 | 0.1%
Other values (205) | 252 | 8.1%

Value | Count | Frequency (%)
0 | 3 | 0.1%
4 × 10^-5 | 1 | < 0.1%
5 × 10^-5 | 1 | < 0.1%
7.5 × 10^-5 | 1 | < 0.1%
8.33333 × 10^-5 | 1 | < 0.1%
9.09091 × 10^-5 | 1 | < 0.1%
0.0001 | 1 | < 0.1%
0.000111111 | 2 | 0.1%
0.000166667 | 1 | < 0.1%
0.000174359 | 1 | < 0.1%

Value | Count | Frequency (%)
222.364725 | 1 | < 0.1%
5.00625 | 1 | < 0.1%
4.55 | 1 | < 0.1%
4.273659199 | 1 | < 0.1%
4.02 | 1 | < 0.1%
4 | 2 | 0.1%
2.915 | 1 | < 0.1%
2.5 | 1 | < 0.1%
2.4 | 1 | < 0.1%
2.375 | 1 | < 0.1%

Interactions

[Figures: 121 pairwise interaction plots]

Correlations

[Figure: correlation heatmap]
Variable | Area_Type | District Name | Original_Storage_Capacity | Present_Storage_Capacity | Reason_for_Water_Body_Use | Renovation_Year | Repair_Renovation_Status | Water_Body_Nature | Water_Body_Type | Water_body_in_use | construcion_year | construction_cost | df_index | filled_up_storage_name | filled_up_storage_space_name | level_0 | no_people_benefited_by_water_body | population_density_benefited | reason_water_body_in_use_name2 | renovation_cost | storage_capacity_change
Area_Type | 1.000 | 0.097 | 0.143 | 0.126 | 0.150 | -0.137 | 0.000 | 0.084 | 0.104 | 0.043 | -0.064 | 0.029 | 0.016 | 0.048 | 0.000 | 0.016 | 0.050 | -0.056 | 0.076 | -0.009 | -0.087
District Name | 0.097 | 1.000 | 0.254 | 0.253 | 0.467 | -0.362 | 0.125 | 0.756 | 0.398 | 0.563 | -0.193 | -0.261 | 0.025 | 0.360 | 0.372 | 0.025 | -0.071 | -0.066 | 0.363 | -0.638 | -0.220
Original_Storage_Capacity | 0.143 | 0.254 | 1.000 | 0.947 | 0.014 | -0.266 | 0.000 | 0.000 | 0.187 | 0.000 | -0.220 | -0.164 | -0.006 | 0.003 | 0.000 | -0.006 | 0.296 | 0.023 | 0.049 | -0.031 | -0.829
Present_Storage_Capacity | 0.126 | 0.253 | 0.947 | 1.000 | 0.014 | -0.245 | 0.000 | 0.000 | 0.187 | 0.000 | -0.167 | -0.087 | 0.004 | 0.003 | 0.000 | 0.004 | 0.321 | 0.037 | 0.049 | -0.011 | -0.695
Reason_for_Water_Body_Use | 0.150 | 0.467 | 0.014 | 0.014 | 1.000 | -0.175 | 0.094 | 0.719 | 0.380 | 0.999 | 0.106 | -0.221 | 0.061 | 0.335 | 0.411 | 0.061 | -0.559 | -0.064 | 0.274 | -0.209 | 0.290
Renovation_Year | -0.137 | -0.362 | -0.266 | -0.245 | -0.175 | 1.000 | 0.421 | 0.314 | 0.233 | 0.363 | 0.523 | 0.317 | 0.090 | 0.213 | 0.137 | 0.090 | 0.165 | -0.423 | 0.183 | 0.416 | 0.257
Repair_Renovation_Status | 0.000 | 0.125 | 0.000 | 0.000 | 0.094 | 0.421 | 1.000 | 0.000 | 0.119 | 0.000 | -0.008 | -0.020 | 0.010 | 0.000 | 0.000 | 0.010 | 0.033 | -0.085 | 0.000 | 0.151 | 0.042
Water_Body_Nature | 0.084 | 0.756 | 0.000 | 0.000 | 0.719 | 0.314 | 0.000 | 1.000 | 0.377 | 0.219 | NaN | NaN | -0.013 | 0.271 | 0.378 | -0.013 | 0.371 | 0.149 | 0.172 | 0.122 | -0.516
Water_Body_Type | 0.104 | 0.398 | 0.187 | 0.187 | 0.380 | 0.233 | 0.119 | 0.377 | 1.000 | 0.150 | 0.079 | 0.360 | -0.043 | 0.107 | 0.049 | -0.043 | -0.032 | 0.047 | 0.181 | 0.083 | 0.477
Water_body_in_use | 0.043 | 0.563 | 0.000 | 0.000 | 0.999 | 0.363 | 0.000 | 0.219 | 0.150 | 1.000 | -0.107 | 0.206 | -0.044 | 0.603 | 0.645 | -0.044 | 0.537 | -0.076 | 0.141 | 0.113 | -0.002
construcion_year | -0.064 | -0.193 | -0.220 | -0.167 | 0.106 | 0.523 | -0.008 | NaN | 0.079 | -0.107 | 1.000 | 0.130 | 0.010 | 0.109 | 0.110 | 0.010 | -0.199 | 0.358 | 0.106 | 0.258 | 0.203
construction_cost | 0.029 | -0.261 | -0.164 | -0.087 | -0.221 | 0.317 | -0.020 | NaN | 0.360 | 0.206 | 0.130 | 1.000 | -0.008 | 0.000 | 0.038 | -0.008 | 0.131 | 0.150 | 0.124 | 0.599 | 0.263
df_index | 0.016 | 0.025 | -0.006 | 0.004 | 0.061 | 0.090 | 0.010 | -0.013 | -0.043 | -0.044 | 0.010 | -0.008 | 1.000 | 0.085 | 0.088 | 1.000 | -0.013 | 0.010 | 0.121 | 0.182 | 0.002
filled_up_storage_name | 0.048 | 0.360 | 0.003 | 0.003 | 0.335 | 0.213 | 0.000 | 0.271 | 0.107 | 0.603 | 0.109 | 0.000 | 0.085 | 1.000 | 0.624 | 0.014 | -0.131 | -0.224 | 0.134 | -0.080 | -0.076
filled_up_storage_space_name | 0.000 | 0.372 | 0.000 | 0.000 | 0.411 | 0.137 | 0.000 | 0.378 | 0.049 | 0.645 | 0.110 | 0.038 | 0.088 | 0.624 | 1.000 | -0.003 | -0.152 | -0.120 | 0.164 | -0.223 | 0.060
level_0 | 0.016 | 0.025 | -0.006 | 0.004 | 0.061 | 0.090 | 0.010 | -0.013 | -0.043 | -0.044 | 0.010 | -0.008 | 1.000 | 0.014 | -0.003 | 1.000 | -0.013 | 0.010 | 0.121 | 0.182 | 0.002
no_people_benefited_by_water_body | 0.050 | -0.071 | 0.296 | 0.321 | -0.559 | 0.165 | 0.033 | 0.371 | -0.032 | 0.537 | -0.199 | 0.131 | -0.013 | -0.131 | -0.152 | -0.013 | 1.000 | -0.031 | 0.107 | 0.125 | -0.232
population_density_benefited | -0.056 | -0.066 | 0.023 | 0.037 | -0.064 | -0.423 | -0.085 | 0.149 | 0.047 | -0.076 | 0.358 | 0.150 | 0.010 | -0.224 | -0.120 | 0.010 | -0.031 | 1.000 | 0.049 | -0.514 | 0.020
reason_water_body_in_use_name2 | 0.076 | 0.363 | 0.049 | 0.049 | 0.274 | 0.183 | 0.000 | 0.172 | 0.181 | 0.141 | 0.106 | 0.124 | 0.121 | 0.134 | 0.164 | 0.121 | 0.107 | 0.049 | 1.000 | 0.230 | 0.062
renovation_cost | -0.009 | -0.638 | -0.031 | -0.011 | -0.209 | 0.416 | 0.151 | 0.122 | 0.083 | 0.113 | 0.258 | 0.599 | 0.182 | -0.080 | -0.223 | 0.182 | 0.125 | -0.514 | 0.230 | 1.000 | 0.196
storage_capacity_change | -0.087 | -0.220 | -0.829 | -0.695 | 0.290 | 0.257 | 0.042 | -0.516 | 0.477 | -0.002 | 0.203 | 0.263 | 0.002 | -0.076 | 0.060 | 0.002 | -0.232 | 0.020 | 0.062 | 0.196 | 1.000
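
One way to read a matrix like this is to list the pairs whose absolute coefficient reaches some cutoff. The sketch below does that for a handful of coefficients copied from the matrix. The cutoff of 0.5 and the helper name `high_pairs` are assumptions made for illustration; the report does not state which threshold produced its "High correlation" alerts.

```python
# A few coefficients copied from the correlation matrix (pair -> value).
pairs = {
    ("Original_Storage_Capacity", "Present_Storage_Capacity"): 0.947,
    ("Original_Storage_Capacity", "storage_capacity_change"): -0.829,
    ("construction_cost", "renovation_cost"): 0.599,
    ("Renovation_Year", "construcion_year"): 0.523,
    ("Area_Type", "Water_Body_Type"): 0.104,
    ("Water_Body_Nature", "construcion_year"): float("nan"),
}

THRESHOLD = 0.5  # assumed cutoff, not taken from the report


def high_pairs(coefficients, threshold):
    """Return pairs with |r| >= threshold, strongest first; NaN never qualifies."""
    selected = [p for p, r in coefficients.items() if abs(r) >= threshold]
    return sorted(selected, key=lambda p: -abs(coefficients[p]))


flagged = high_pairs(pairs, THRESHOLD)
for a, b in flagged:
    print(f"{a} ~ {b}: {pairs[(a, b)]:+.3f}")
```

Comparisons against NaN are always false in Python, so the undefined Water_Body_Nature and construcion_year entry drops out without special handling.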

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
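
Nullity correlation can be sketched as an ordinary Pearson correlation computed on missing/present indicators instead of on the values themselves. The example below uses three columns from the last ten sample rows of this report, with `None` standing in for NaN. The helper names `nullity` and `pearson` are illustrative, and this is a minimal sketch of the idea rather than the code behind the heatmap.

```python
from math import sqrt

# Last ten sample rows of the report; None stands in for NaN.
construcion_year = [2000, 2001, None, None, None, None, 2000, 2000, 2000, 2000]
construction_cost = [150000, 50000, None, None, None, None,
                     50000, 30000, 100000, 80000]
renovation_year = [None, None, 2016, None, None, None, 2012, 2012, 2012, 2012]


def nullity(column):
    """1 where the value is missing, 0 where it is present."""
    return [1 if v is None else 0 for v in column]


def pearson(x, y):
    """Plain Pearson correlation; undefined if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


year_vs_cost = pearson(nullity(construcion_year), nullity(construction_cost))
year_vs_renovation = pearson(nullity(construcion_year), nullity(renovation_year))

print(round(year_vs_cost, 3))
print(round(year_vs_renovation, 3))
```

In these ten rows the construction year and construction cost are always missing together, so their nullity correlation is 1; the renovation year is missing on a partly different set of rows, giving a weaker positive value.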

Sample

(index) | level_0 | df_index | Area_Type | State Name | District Name | Water_Body_Type | Water_body_in_use | Reason_for_Water_Body_Use | Water_Body_Nature | construcion_year | construction_cost | Renovation_Year | renovation_cost | Repair_Renovation_Status | Original_Storage_Capacity | Present_Storage_Capacity | filled_up_storage_name | filled_up_storage_space_name | no_people_benefited_by_water_body | reason_water_body_in_use_name2 | storage_capacity_change | population_density_benefited
0 | 0 | 0 | Rural | UTTARAKHAND | UDHAM SINGH NAGAR | Ponds | 0 | other | Natural | NaN | NaN | NaN | NaN | 0 | 23481 | 15654 | Nil/Negligible filled up | Rarely filled up | 4.0 | NaN | -7827 | 1.125101
1 | 1 | 1 | Rural | UTTARAKHAND | UDHAM SINGH NAGAR | Ponds | 0 | other | Man-made | 2019.0 | 80000.0 | NaN | NaN | 0 | 632 | 432 | Nil/Negligible filled up | Rarely filled up | 4.0 | NaN | -200 | 1.125101
2 | 2 | 2 | Rural | UTTARAKHAND | UDHAM SINGH NAGAR | Ponds | 0 | other | Man-made | 2017.0 | 70000.0 | NaN | NaN | 0 | 4600 | 4500 | Upto 1/4 | Rarely filled up | 4.0 | NaN | -100 | 1.125101
3 | 3 | 3 | Rural | UTTARAKHAND | UDHAM SINGH NAGAR | Ponds | 0 | other | Man-made | 2017.0 | 50000.0 | NaN | NaN | 0 | 3000 | 2400 | Upto 1/2 | Rarely filled up | 2.0 | NaN | -600 | 1.125101
4 | 4 | 4 | Rural | UTTARAKHAND | HARIDWAR | Ponds | 1 | Ground water recharge | Natural | NaN | NaN | NaN | NaN | 0 | 990 | 540 | Nil/Negligible filled up | Never filled up | 2.0 | NaN | -450 | 1.125101
5 | 5 | 5 | Rural | UTTARAKHAND | HARIDWAR | Ponds | 1 | Ground water recharge | Natural | NaN | NaN | NaN | NaN | 0 | 675 | 240 | Nil/Negligible filled up | Never filled up | 2.0 | NaN | -435 | 1.125101
6 | 6 | 6 | Rural | UTTARAKHAND | UDHAM SINGH NAGAR | Ponds | 0 | other | Natural | NaN | NaN | NaN | NaN | 0 | 6675 | 2670 | Full | Usually filled up | 3.0 | NaN | -4005 | 1.125101
7 | 7 | 7 | Urban | UTTARAKHAND | HARIDWAR | Ponds | 1 | Ground water recharge | Natural | NaN | NaN | NaN | NaN | 0 | 14536 | 13000 | Upto 1/2 | Rarely filled up | 20.0 | NaN | -1536 | 1.125101
8 | 8 | 8 | Rural | UTTARAKHAND | UDHAM SINGH NAGAR | Ponds | 0 | other | Natural | NaN | NaN | NaN | NaN | 0 | 7614 | 2538 | Upto 1/2 | Rarely filled up | 1.0 | NaN | -5076 | 1.125101
9 | 9 | 9 | Rural | UTTARAKHAND | UDHAM SINGH NAGAR | Ponds | 0 | other | Natural | NaN | NaN | NaN | NaN | 0 | 2358 | 1179 | Upto 1/2 | Rarely filled up | 3.0 | NaN | -1179 | 1.125101

(index) | level_0 | df_index | Area_Type | State Name | District Name | Water_Body_Type | Water_body_in_use | Reason_for_Water_Body_Use | Water_Body_Nature | construcion_year | construction_cost | Renovation_Year | renovation_cost | Repair_Renovation_Status | Original_Storage_Capacity | Present_Storage_Capacity | filled_up_storage_name | filled_up_storage_space_name | no_people_benefited_by_water_body | reason_water_body_in_use_name2 | storage_capacity_change | population_density_benefited
3086 | 3086 | 3086 | Rural | UTTARAKHAND | DEHRADUN | Ponds | 0 | other | Man-made | 2000.0 | 150000.0 | NaN | NaN | 0 | 120000 | 100 | Nil/Negligible filled up | Never filled up | 2.0 | NaN | -119900 | 1.125101
3087 | 3087 | 3087 | Rural | UTTARAKHAND | DEHRADUN | Ponds | 1 | Irrigation | Man-made | 2001.0 | 50000.0 | NaN | NaN | 0 | 21000 | 21000 | Upto 3/4 | Filled up every year | 8.0 | NaN | 0 | 1.125101
3088 | 3088 | 3088 | Rural | UTTARAKHAND | NANITAL | Ponds | 1 | Pisciculture | Natural | NaN | NaN | 2016.0 | 8424.0 | 0 | 24 | 20 | Upto 3/4 | Usually filled up | 5.0 | Domestic/Drinking | -4 | 0.002374
3089 | 3089 | 3089 | Rural | UTTARAKHAND | HARIDWAR | Ponds | 0 | other | Natural | NaN | NaN | NaN | NaN | 0 | 2400 | 1000 | Nil/Negligible filled up | Never filled up | 2.0 | NaN | -1400 | 1.125101
3090 | 3090 | 3090 | Rural | UTTARAKHAND | HARIDWAR | Ponds | 0 | other | Natural | NaN | NaN | NaN | NaN | 0 | 1500 | 600 | Nil/Negligible filled up | Never filled up | 1.0 | NaN | -900 | 1.125101
3091 | 3091 | 3091 | Rural | UTTARAKHAND | HARIDWAR | Ponds | 1 | Ground water recharge | Natural | NaN | NaN | NaN | NaN | 0 | 600 | 300 | Full | Filled up every year | 4.0 | NaN | -300 | 1.125101
3092 | 3092 | 3092 | Rural | UTTARAKHAND | UDHAM SINGH NAGAR | Ponds | 1 | Pisciculture | Man-made | 2000.0 | 50000.0 | 2012.0 | 5000.0 | 0 | 5040 | 4000 | Upto 3/4 | Usually filled up | 15.0 | Ground water recharge | -1040 | 0.800000
3093 | 3093 | 3093 | Rural | UTTARAKHAND | UDHAM SINGH NAGAR | Ponds | 1 | Pisciculture | Man-made | 2000.0 | 30000.0 | 2012.0 | 3000.0 | 0 | 1782 | 1200 | Upto 3/4 | Usually filled up | 11.0 | Ground water recharge | -582 | 0.400000
3094 | 3094 | 3094 | Rural | UTTARAKHAND | UDHAM SINGH NAGAR | Ponds | 1 | Pisciculture | Man-made | 2000.0 | 100000.0 | 2012.0 | 10000.0 | 0 | 28569 | 21000 | Upto 3/4 | Usually filled up | 20.0 | Ground water recharge | -7569 | 2.100000
3095 | 3095 | 3095 | Rural | UTTARAKHAND | UDHAM SINGH NAGAR | Ponds | 1 | Pisciculture | Man-made | 2000.0 | 80000.0 | 2012.0 | 8000.0 | 0 | 15444 | 12000 | Upto 3/4 | Usually filled up | 16.0 | Ground water recharge | -3444 | 1.500000
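
The sample rows are consistent with storage_capacity_change being Present_Storage_Capacity minus Original_Storage_Capacity. That relationship is an inference from the sample values, not something the report states, and the short check below only confirms it on a few rows copied from the sample. The function name `capacity_change` is illustrative.

```python
# (Original_Storage_Capacity, Present_Storage_Capacity, storage_capacity_change)
# copied from sample rows 0, 1, 2, 3086 and 3087.
rows = [
    (23481, 15654, -7827),
    (632, 432, -200),
    (4600, 4500, -100),
    (120000, 100, -119900),
    (21000, 21000, 0),
]


def capacity_change(original, present):
    """Assumed definition: present capacity minus original capacity."""
    return present - original


mismatches = [r for r in rows if capacity_change(r[0], r[1]) != r[2]]
print("rows that do not match:", len(mismatches))
```

If the relationship holds across the whole dataset, it would also account for the strong negative correlation (-0.829) reported between Original_Storage_Capacity and storage_capacity_change.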